ABSTRACT

This paper discusses a model of information storage 
and retrieval, the k-d tree (Bentley, 1975), a binary, hierarchical 
tree with multiple associate terms, which has been explored in 
computer research, and it is suggested that this model could be 
useful for describing human cognition. Included are two models of
human long-term memory, networks and hierarchies, and reasons are
given for the higher efficiency of hierarchical theories, including
the k-d tree. A description of the k-d tree includes its structure, 
computation rates, and balancing (branching); its applications to 
human cognition, including a comparison with Piaget's notions of 
equilibrium and cognitive stages of development; application to 
memory and forgetting theories; convergent and divergent thinking
processes; logic paths and decision-making; and the function of 
sleep. A concluding discussion compares human and computer processing 
of information, and raises questions related to the hierarchical 
structure of brain activity. It suggests that the k-d model from 
information science may have a strong relevance to the study of human 
cognition, particularly regarding the study of memory and sleep, 
while at the same time allowing for vast differences between a 
computer and the human mind. (JB) 
The k-d Tree: A Hierarchical Model for Human Cognition

Information processing, whether in humans or machines,
comprises two separate fields: hardware and software. Hardware
refers to the physical storage processes and the interconnections
between areas. In psychobiology, this means the physical
interconnections of neurons and intraneuronal changes that are
affected by learning. Software refers to the ways in which data
are referenced, accessed and manipulated. Hardware determines
what kind of computations can be done; software determines how
computations are in fact done.

This paper considers only software questions. It proposes a
model of information storage and retrieval, the k-d tree (Bentley,
1975), that has been explored in computer research and that
should be a useful model for describing human cognition.

Basically two models of human long-term memory retrieval
have been put forth, networks and hierarchies. Each addresses 
one of the two basic information problems that must be 
considered. Each has advantages and disadvantages. 

The network theory is thought to be best in describing the
multiplicity of connotations that any one image, word or concept
evokes in human thinking. Quillian's Teachable Language
Comprehender (1966) was one of the first theories. Anderson's
ACT theory (1976) is probably the most well known and discussed.
Network theories are generally inadequate in their explanations
of the speed of human information processing; nor
do they necessarily explain why some concepts are more readily
accessed than others. ACT theory, for example, asserts that both
speed of retrieval and accessibility can be explained by
the strength of the links between nodes. Strength is directly
related to the number of times a link has been executed or used.

In computer science terms, network models invariably
generate classes of problems that are NP complete. NP
completeness refers to the length of time it would take to
recover information. To say that network models are NP complete
means that a non-deterministic machine will compute an answer in
time P(n), where P(n) is a polynomial dependent on the number of
nodes used in the network. Although the non-determinism constraint
can be removed by the parallelism of function in brain cells,
computations bound to run within polynomial time do pose
troubling questions about the limiting speeds at which a brain
would function. Because of the existence of closed loops within
the network, some concern must also be raised about the mechanism
used to terminate a thought process once it has begun.

Hierarchical theories, on the other hand, are far more
efficient data structures for retrieving and storing data.
Computer studies of k-d trees have been limited to the hardware
search assumption of sequential processing. Even so, the worst
retrieval time bound is proportional to n raised to a fractional
power, orders of magnitude faster than the network model.
Only one of the computational processes used in a k-d tree is
relatively computationally inefficient, in that it is proportional
to n log n. This process is tree reorganization, also referred
to as balancing in this paper. Reorganization is a time-consuming
process. This paper suggests that it occurs over
significant periods of time, possibly during sleep. The
hierarchical model also provides a natural bounding mechanism,
the depth of a tree, that would serve to limit the amount of
computational resources needed for a given problem. Thus, the
k-d tree is an efficient model.

The k-d tree model 

The k-d tree is a binary tree. Although a similar structure 
called a Quad tree (Bentley, Stanat, 1975) allows for a multiway 
branching, it is more difficult to program and less general 
computer science claims have been applied to it to date. 

Structure. The k-d tree is composed of decision points that 
represent stored information. The information is first quantified 
into k-tuples. Each of the k cross-product domains is a different 
dimensional axis along which the data has been collected. 
Information stored within these trees is categorized by the 
context of preceding information in a most interesting way:
every node of a k-d tree orders the information stored below it 
in the tree on the basis of one of its k dimensions of 
information. 
This produces a tree that does not cluster concepts
together; rather it tends to spread them throughout the tree in
paths that are determined by the order in which the information
was originally assimilated. These learning-dependent associations
can account for "insight" or "intuition" since the search process
is enriched by the chance association of concepts that preceded
the assimilation of new data and by the tree reorganization
process. In this model concepts would continue to be influenced
by initial associations long after the concept had been learned.
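
The order dependence described above can be illustrated with a
minimal sketch. It assumes the common convention that the
discriminating dimension cycles with depth and that smaller keys
go left; the names Node, insert, build and height, and the sample
records, are hypothetical and are not taken from Bentley's paper.

```python
from typing import Optional, Sequence, Tuple

Point = Tuple[int, ...]


class Node:
    """One decision point holding a stored k-tuple."""

    def __init__(self, point: Point):
        self.point = point
        self.left: Optional["Node"] = None    # smaller on this node's axis
        self.right: Optional["Node"] = None   # greater or equal on this axis


def insert(root: Optional[Node], point: Point, depth: int = 0) -> Node:
    """Insert a k-tuple; the discriminating axis cycles with depth (assumed)."""
    if root is None:
        return Node(point)
    axis = depth % len(point)
    if point[axis] < root.point[axis]:
        root.left = insert(root.left, point, depth + 1)
    else:
        root.right = insert(root.right, point, depth + 1)
    return root


def build(points: Sequence[Point]) -> Optional[Node]:
    """Assimilate records one at a time, in the order given."""
    root = None
    for p in points:
        root = insert(root, p)
    return root


def height(root: Optional[Node]) -> int:
    return 0 if root is None else 1 + max(height(root.left), height(root.right))


# The same seven hypothetical 2-tuples, assimilated in two different orders.
records = [(5, 4), (2, 6), (8, 5), (1, 3), (3, 8), (7, 2), (9, 7)]
mixed = build(records)
in_sorted_order = build(sorted(records))
print(height(mixed), height(in_sorted_order))
```

The two trees hold identical records, yet the paths leading to
each record, and the depth of the tree, differ with the order of
presentation.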

This model can provide more concise descriptions of human
cognitive processing than have been previously available.
Previous hierarchical theories were not able to explain
parsimoniously the multiplicity of associations that are attached
to any one word or concept. The k-d tree structure solves this
problem because of the k dimensions associated with each node.

Computation rates. The k-d tree is the most effective and
general software mechanism known for solving associative queries.
No other software data structure that has been suggested applies
to every category of associative query: exact match search; range
search; partial match search; and nearest neighbor search. The
performance of k-d trees on sequential-process
machines, which are far less efficient than the brain, is
surprising.

To categorize the relative speed of computational processes,
an approximation notation O(f(n)) is used, where f(n) is a
function of the positive integers n; O(f(n)) stands for a
quantity that is not explicitly known except that its magnitude
is at most a constant multiple of f(n). More precisely, there is
a positive constant m such that the number x_n represented by
O(f(n)) satisfies the condition |x_n| <= m |f(n)| for all
n >= n_0. The constants m and n_0 are not specified and will
differ for each approximation.

The average running time for sequential k-d tree algorithms
with k dimensions has been shown to be (Bentley, 1975):
insertion, O(log n); deletion of the root, O(n^((k-1)/k)); deletion
of a random node, O(log n); and optimization (guarantees
logarithmic performance of searches), O(n log n). For nearest
neighbor searches the empirically observed average running time
is O(log n). A partial match query with t keys specified has
a maximum running time of O(n^((k-t)/k)). All of these performances
were presented in Jon Bentley's original paper on k-d trees, and
all of them either surpass or equal all other known algorithms
for these tasks. No other computer-stored structure has been
found that is either as versatile or as efficient for all forms
of associative query.
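
A partial match query, in which only some of the k keys are
specified, can be sketched as follows. This is an illustration
under assumptions rather than Bentley's own formulation: None
marks an unspecified key, the axis cycles with depth, and the
names and sample records are hypothetical.

```python
from typing import List, Optional, Tuple

Point = Tuple[int, ...]
Query = Tuple[Optional[int], ...]   # None marks an unspecified key (assumed)


class Node:
    def __init__(self, point: Point):
        self.point = point
        self.left: Optional["Node"] = None
        self.right: Optional["Node"] = None


def insert(root: Optional[Node], point: Point, depth: int = 0) -> Node:
    if root is None:
        return Node(point)
    axis = depth % len(point)
    if point[axis] < root.point[axis]:
        root.left = insert(root.left, point, depth + 1)
    else:
        root.right = insert(root.right, point, depth + 1)
    return root


def partial_match(root: Optional[Node], query: Query, depth: int = 0,
                  found: Optional[List[Point]] = None) -> List[Point]:
    """Collect every record that agrees with the query on all specified keys."""
    if found is None:
        found = []
    if root is None:
        return found
    if all(q is None or q == v for q, v in zip(query, root.point)):
        found.append(root.point)
    axis = depth % len(query)
    q = query[axis]
    # A specified key rules out one subtree; an unspecified key
    # forces the search to descend into both.
    if q is None or q < root.point[axis]:
        partial_match(root.left, query, depth + 1, found)
    if q is None or q >= root.point[axis]:
        partial_match(root.right, query, depth + 1, found)
    return found


root = None
for record in [(5, 4, 1), (2, 6, 3), (8, 5, 3), (1, 3, 2),
               (3, 8, 3), (7, 2, 1), (9, 7, 3)]:
    root = insert(root, record)

print(sorted(partial_match(root, (None, None, 3))))   # one key specified
print(partial_match(root, (2, 6, 3)))                 # exact match: all keys
```

The more keys the query specifies, the more subtrees are pruned,
which is the intuition behind the exponent (k-t)/k shrinking as
t grows.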

Balancing. The k-d tree is most efficient when it is
balanced; that is, when it is most bushy. This is formally
described as a tree with no more than one level difference
between the bottommost nodes. Only in this case are the
excellent performance characteristics strictly true. As the tree
becomes increasingly less balanced, the performance
characteristics deteriorate and the response time needed to
answer associative queries eventually becomes O(n). Long before
this point is reached, it is advantageous to rebalance at least
part of the tree. An algorithm which has optimal O(n log n)
characteristics is known that will rebuild a balanced tree. For
large trees, however, the need for a complete rebalancing will be
infrequent.
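
Rebuilding a balanced tree can be sketched by splitting the
records at the median of each axis in turn. This simple version
re-sorts at every level, so it is slower than the O(n log n)
algorithm cited above; a median-selection step would be needed
to meet that bound. All names and the sample data are
hypothetical.

```python
from typing import List, Optional, Tuple

Point = Tuple[int, ...]


class Node:
    def __init__(self, point: Point):
        self.point = point
        self.left: Optional["Node"] = None
        self.right: Optional["Node"] = None


def insert(root: Optional[Node], point: Point, depth: int = 0) -> Node:
    """Unbalanced insertion: smaller keys go left, all others right."""
    if root is None:
        return Node(point)
    axis = depth % len(point)
    if point[axis] < root.point[axis]:
        root.left = insert(root.left, point, depth + 1)
    else:
        root.right = insert(root.right, point, depth + 1)
    return root


def build_balanced(points: List[Point], depth: int = 0) -> Optional[Node]:
    """Rebuild a balanced tree by splitting at the median of each axis."""
    if not points:
        return None
    axis = depth % len(points[0])
    ordered = sorted(points, key=lambda p: p[axis])
    mid = len(ordered) // 2
    # Step back over ties so that equal keys stay in the right subtree,
    # matching the rule used by insert().
    while mid > 0 and ordered[mid - 1][axis] == ordered[mid][axis]:
        mid -= 1
    node = Node(ordered[mid])
    node.left = build_balanced(ordered[:mid], depth + 1)
    node.right = build_balanced(ordered[mid + 1:], depth + 1)
    return node


def height(root: Optional[Node]) -> int:
    return 0 if root is None else 1 + max(height(root.left), height(root.right))


points = [(i, (7 * i) % 15) for i in range(15)]   # 15 distinct records
unbalanced = None
for p in points:                                  # assimilated in sorted order
    unbalanced = insert(unbalanced, p)
balanced = build_balanced(points)
print(height(unbalanced), height(balanced))
```

With 15 distinct records the rebuilt tree has four levels, the
minimum possible, whereas the tree grown by insertion alone is
deeper.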

Applications to human cognition

Thus we have in the k-d tree an extremely fast, organized
and associative data structure. Certain aspects of this model
seem particularly relevant to research and theory about human
learning.

Piagetian theory. The balanced hierarchical tree is
certainly relevant to Piaget's notion of equilibration. While
Piaget sometimes seems to emphasize the congruence aspect of
equilibration, he also included in his notion that of
hierarchical order and categorization. Assimilation could then
be defined as the adding of information to the tree, without an
attempt to rebalance it. However, as the tree becomes more and
more unbalanced, disequilibration is reached. Finally, the tree
will be rebalanced, at the expense of time and energy, and
accommodation is achieved. For the k-d tree this is a
computational process requiring an average running time of
O(n log n).

It should not be expected that the entire tree will usually
be rebalanced. Typically, accommodation will take place in only
one part of the tree. This would account for decalage, the
uneven cognitive development found in most children. However,
occasionally partial rebalancing will not be a satisfactory
solution, and the whole tree must be resorted and the data
recategorized. Piaget's conception of cognitive stages of
development seems logically to parallel the idea of rebalancing
the entire tree. The concrete operational stage seems
particularly to represent the completely rebalanced tree that
results in an integrated hierarchical structure. In that period,
children are able to understand sets and subset memberships.
Formal operations as a stage may not necessitate a total
rebalancing; rather it might be that formal operations require
the addition of a level of hypothesizing at the top of the tree
rather than a reorganization at the bottom of the tree.

Similar comparisons may be made for the first two of 
Piaget's stages. The beginnings of representational thought
which occur at the end of the sensorimotor stage can usefully be 
thought of as the formation of the first hierarchies. Initially, 
schema are discrete memories, but then they are organized into 
related concepts. Preoperational thought, like formal 
operations, would be marked by the addition of conceptual levels 
at the top of the tree. 

This model can provide a basis for exploring the hypotheses 
of Piaget in both computers and children. A computer simulation 
could be designed to accept new information without rebalancing 
for a particular time period. Then it would be allowed to
rebalance, and we could trace the changes in cognitive structure.
Depending on the particular time period, we could expect either
the development of superordinate concepts or realignment of
cognitive categories.
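
The simulation proposed above might be sketched as follows: new
records are accepted without rebalancing for a fixed period,
after which the whole tree is rebuilt and the change in depth is
logged. The period, the use of tree height as the traced measure,
and all names are assumptions made for illustration only.

```python
from typing import List, Optional, Tuple

Point = Tuple[int, ...]


class Node:
    def __init__(self, point: Point):
        self.point = point
        self.left: Optional["Node"] = None
        self.right: Optional["Node"] = None


def insert(root, point, depth=0):
    """Assimilation: add a record with no attempt to rebalance."""
    if root is None:
        return Node(point)
    axis = depth % len(point)
    if point[axis] < root.point[axis]:
        root.left = insert(root.left, point, depth + 1)
    else:
        root.right = insert(root.right, point, depth + 1)
    return root


def build_balanced(points: List[Point], depth: int = 0):
    """Accommodation: rebuild the tree around per-axis medians."""
    if not points:
        return None
    axis = depth % len(points[0])
    ordered = sorted(points, key=lambda p: p[axis])
    mid = len(ordered) // 2
    while mid > 0 and ordered[mid - 1][axis] == ordered[mid][axis]:
        mid -= 1                      # keep equal keys in the right subtree
    node = Node(ordered[mid])
    node.left = build_balanced(ordered[:mid], depth + 1)
    node.right = build_balanced(ordered[mid + 1:], depth + 1)
    return node


def height(root) -> int:
    return 0 if root is None else 1 + max(height(root.left), height(root.right))


def collect(root, out=None) -> List[Point]:
    """Gather every stored record so the tree can be rebuilt."""
    out = [] if out is None else out
    if root is not None:
        out.append(root.point)
        collect(root.left, out)
        collect(root.right, out)
    return out


def simulate(stream: List[Point], period: int):
    """Insert freely, rebalancing only after every `period` records."""
    root, log = None, []
    for count, record in enumerate(stream, 1):
        root = insert(root, record)
        if count % period == 0:
            before = height(root)
            root = build_balanced(collect(root))
            log.append((count, before, height(root)))
    return root, log


stream = [(i, (11 * i) % 31) for i in range(30)]   # hypothetical input
tree, log = simulate(stream, period=10)
for count, before, after in log:
    print(count, before, after)
```

Each log entry gives the number of records seen and the tree
height before and after rebalancing, a crude trace of the change
in structure that the simulation is meant to expose.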

In children, decalage should be most apparent in rather
unrelated concept areas. Furthermore, accommodation, as well as
stage transitions, could be studied to see if new
information alone creates growth or rather if a new way of
conceiving what is already known is necessary. For example, a
new area of information could be presented. If this model is
correct, it should be expected that initially the learner will
try to incorporate the information with what he already knows.
We should expect gaps in information as well as misinformation.
However, as the learning process progresses, there should be
detectable moments when the information has been recategorized,
less information is lost, and misinformation is minimized.
Finally, if enough partial balancing is made necessary, we would
expect a total rebalancing resulting in an integration of the
newly organized information with older, more established
concepts. Incongruities would be noticed and resolved. It
should be expected that this total rebalancing would manifest
itself suddenly as a consequence of a median k-tuple (concept)
achieving its place near the root of the knowledge tree.

Memory and Forgetting. It is semantic memory which seems
most suitable for exploration with the k-d tree model. Because
semantic memory is dependent upon context and multiple
references, its structure should parallel fairly closely the
structure of the tree.

Work by Collins and Quillian (1969) suggests that highly similar 
concepts are retrieved and compared more quickly than less 
similar ones. These studies offer the beginnings of support for
this model.

It is unclear whether episodic memory can also be explained 
by this model. However, some evidence suggests that the tree
structure might be a useful model, if we hypothesize a separate
tree or subtree whose key indices are contextual cues. Each cue 
would be linked to all the relevant data associated with the 
remembered event. When a cueing stimulus is perceived, it would 
elicit the remembered event, or one which is perceived by the 
individual as a deja vu experience. 

The k-d tree also can give an explanation for why mnemonics,
such as the method of loci, work. The method of loci requires
the individual to associate terms or concepts with a well-known
physical terrain. He then can recall concepts more readily if he
"walks through" the terrain in his imagination, and remembers the
associates of the physical characteristics he is "seeing." In
k-d terms, he has added an extra characteristic to the nodes in a
subtree he has already established. Because this subtree is
already well established, it is easy to retrieve, along with the
associated concepts or terms.

Inhibition, both retroactive and proactive, can also be
described in terms of the tree. First, we assume that partial
rebalancing goes on fairly frequently. We also assume that
interruptions of the rebalancing process take place fairly often.
J. Vandendorpe (1980) has shown that an interruption of
balancing results in lost information; inhibition occurs when
such an interruption happens. When it is new information which
is being placed in its proper node, the new information may be
lost. When the tree is being rebalanced, links to old
information may be lost. It must be remembered that the data
records are not destroyed, but that the ability to retrieve the
records is lost. This could be why seemingly forgotten memories
can be elicited by direct neuronal stimulation. Furthermore,
depending upon whether or not there are many pointers to the
missing information, the person may or may not realize that he
has forgotten anything.

The k-d model is also supported by the research which
generally supports the retrieval-failure theory of forgetting,
most often used to describe short-term memory failure. Dillon's
(1973) research on the accessibility of items supports the idea
that the difficulties in retrieving information lie not in
deciding whether the information is pertinent, but in finding the
information.

The relatively greater impact of proactive inhibition as
described by Keppel and Underwood (1962) can also be understood
in terms of this model. New information is more vulnerable to
forgetting because fewer associates have been developed, and this
is especially so when the associates used are similar or even
identical. Proactive inhibition can be diminished, however, if
we make use of a totally different class of associates. Wickens
(1972) reported a demonstration of the "release from proactive
inhibition" effect. Furthermore, research by Dillon and Bittner
(1975), Gardiner, Craik and Birtwistle (1972) and O'Neill,
Sutcliffe and Tulving (1976) all suggest that presenting a new
subcategory to the subject reduces proactive inhibition.

Retroactive inhibition is less powerful because old
information is likely to have many referents, and loss is likely
to be more often detected, if not always corrected. If this is
true, it should be found that material forgotten because of
proactive inhibition is more often totally lost or inaccessible.
The individual should more often not even know he has forgotten
something. Material forgotten because of retroactive inhibition
should be more easily restored, or at least the subject should
realize he has forgotten something.

Convergent and Divergent Thinking. Convergent thinking,
or the process which results in the single correct answer, is
likely to be a rather simple process in terms of k-d trees.
Convergent thinking would occur when the tree is searched in the
typical, top-down manner. If the correct decisions are made at
all the nodes, the correct answer is retrieved.

Divergent thinking could be defined in any of three ways
with this model. The person could access the tree in the normal
manner, but when she has reached one correct conclusion, she
could re-access the tree, making different decisions at some of
the nodes. A second approach would be to use the referent nodes
found along the path to the first correct conclusion to access
the tree in a horizontal reference pattern. This approach would
allow for bottom-up as well as top-down searching. A final way
that divergent thinking might be explained is in the concept of
rebalancing. Divergent thinking would be the product of resorted
and recategorized data organizations. The need for an incubation
period in the creative solution of problems would suggest that
the last definition is a most appropriate one.

Logic paths and decision making. The structure of the k-d 
tree, as determined by the order of the presentation of 
information, produces some interesting ideas. In accessing a 
datum, the path taken provides associations, but it also is in 
itself a structure or logic path. This logic path could 
constitute the formation of implicit rules which have been 
examined in many recent learning studies. 
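
The notion of a logic path can be made concrete with a small
sketch: an exact match search that returns the sequence of nodes
visited on the way to a datum. The function name find_path and
the sample records are hypothetical, and the axis is assumed to
cycle with depth.

```python
from typing import List, Optional, Tuple

Point = Tuple[int, ...]


class Node:
    def __init__(self, point: Point):
        self.point = point
        self.left: Optional["Node"] = None
        self.right: Optional["Node"] = None


def insert(root: Optional[Node], point: Point, depth: int = 0) -> Node:
    if root is None:
        return Node(point)
    axis = depth % len(point)
    if point[axis] < root.point[axis]:
        root.left = insert(root.left, point, depth + 1)
    else:
        root.right = insert(root.right, point, depth + 1)
    return root


def find_path(root: Optional[Node], target: Point, depth: int = 0,
              path: Optional[List[Point]] = None) -> Optional[List[Point]]:
    """Top-down exact match; returns the nodes visited, or None if absent."""
    path = [] if path is None else path
    if root is None:
        return None
    path.append(root.point)
    if root.point == target:
        return path
    axis = depth % len(target)
    child = root.left if target[axis] < root.point[axis] else root.right
    return find_path(child, target, depth + 1, path)


root = None
for record in [(5, 4), (2, 6), (8, 5), (1, 3), (3, 8), (7, 2), (9, 7)]:
    root = insert(root, record)

print(find_path(root, (7, 2)))   # nodes passed on the way to the datum
print(find_path(root, (4, 4)))   # None: the datum was never stored
```

Every record met along the returned path is an association made
available by the search, and the path itself depends on the order
in which the records were first inserted.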

This same logic path structure can model the effects of 
instructional methods and examples that have been found to be 
strongly influential in decision making speed and accuracy. In 
the k-d structure, instructions and examples affect path choices 
at the root of the tree, and thus can determine what information 
is accessible and what kinds of decisions are possible.

The function of sleep. Some researchers have identified
the REM state with information processing, such as sorting,
coding and referencing. Recently, Crick (1983) has taken the
opposite view that dreaming is actually a forgetting process, or
at least one in which irrelevant images are discarded. If we
conceive of the waking state as one in which information is
assimilated, creating more and more unbalanced trees, then REM
sleep could be the time in which major rebalancing efforts are
made. If this is true, then the currently opposing views can be
reconciled. Sorting and coding do take place, as well as
inhibition and actual removal of information. There might be a
loosely affiliated subtree which retains daily memories, and
which is resorted and integrated into the main conceptual tree.
If this is true, proactive inhibition should be stronger for
material that is followed by a sleep period.

Does the remembered dream have a relation to this 
rebalancing process? The dream might be a reflection of the 
material in the temporary storage area which is waiting to be 
re-inserted in the tree. Data in the temporary storage area 
might not be organized meaningfully, and this could be one reason 
why dreams often have little internal logic. 

Discussion 

This paper does not simply assume that humans and computers
actually process information in an identical manner. Most
importantly, computers have a single central processor that
essentially only does one thing at a time. It seems most likely,
however, that humans process many items at the same time.
Furthermore, computers typically operate by way of algorithms,
and only laboriously and in a very limited fashion can they
develop new algorithms. Humans, on the other hand, easily
generate new rules and discard nonworking ones with almost as
much ease.

Nor does the model proposed in this paper necessarily imply 
a particular hardware, a particular neural anatomy. While some 
organizations of neurons might make k-d trees more
straightforward, there does not seem at present to be any
pattern that would preclude the existence of k-d trees.
A relevant physiological question is whether or not brain 
activity propagates in a roughly hierarchical way, from one area 
to another area and then back again to the point of origin. 

While this paper has taken the position of describing human 
information storage in terms of a single main tree, that is not 
necessarily the case. It would seem reasonable that semantic, 
episodic and daily memory trees are rather unrelated, and may 
only share the initial root. They may even be entirely 
unrelated. Whatever the degree of relationship, the methods of
storage and retrieval of data are thought to be similar.

This paper has offered a model from information science that 
may have strong relevance to the study of human cognition. The 
k-d model is efficient, ordered, and has the capacity for 
associative retrieval. It seems particularly relevant to the 
study of memory and forgetting.